A WFST-based log-linear framework for speaking-style transformation
Abstract
● Objective: Transform spoken-style language (V) into written-style language (W) for the creation of transcripts
● Approach: Statistical machine translation to “translate” from verbatim text to written text
● Innovations:
  ● Log-linear modeling for improved accuracy
  ● Introduction of features to handle common phenomena in speaking-style transformation
  ● WFST-based implementation for integration with WFST-based speech recognizers
● Evaluation on transformation of Japanese verbatim transcripts showed improvement over traditional methods
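To make the log-linear idea concrete, here is a minimal sketch of how a written-style candidate W might be scored against a verbatim input V as a weighted sum of feature functions, with the highest-scoring candidate selected. The feature functions, the filler list, and the weights are all hypothetical and chosen only for illustration; they are not the features or weights used in the paper, and the sketch enumerates candidates directly rather than using a WFST.

```python
import math

# Hypothetical filler inventory, for illustration only.
FILLERS = {"um", "uh", "er"}


def f_fillers_kept(v, w):
    """Number of filler words still present in the written-style candidate."""
    return sum(1 for tok in w if tok in FILLERS)


def f_deleted(v, w):
    """Number of tokens dropped going from verbatim text v to candidate w."""
    return max(len(v) - len(w), 0)


# Assumed features and weights; a real system would tune the weights on data.
FEATURES = [f_fillers_kept, f_deleted]
WEIGHTS = [-2.0, -0.5]


def score(v, w):
    """Log-linear score: sum_i lambda_i * f_i(v, w)."""
    return sum(lam * f(v, w) for lam, f in zip(WEIGHTS, FEATURES))


def best_candidate(v, candidates):
    """Return the candidate with the highest log-linear score."""
    return max(candidates, key=lambda w: score(v, w))


def posterior(v, candidates):
    """Normalize exponentiated scores into a distribution over candidates."""
    scores = [score(v, w) for w in candidates]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


if __name__ == "__main__":
    verbatim = ["um", "i", "think", "so"]
    cands = [["um", "i", "think", "so"], ["i", "think", "so"], ["think"]]
    print(best_candidate(verbatim, cands))
    print(posterior(verbatim, cands))
```

With these assumed weights, keeping a filler costs more than deleting a token, so the candidate that drops only the filler scores highest. Adding a new phenomenon to the model amounts to adding one more feature function and one more weight.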
Similar resources
Language model adaptation using WFST-based speaking-style translation
This paper describes a new approach to language model adaptation for speech recognition based on the statistical framework of speech translation. The main idea of this approach is to compose a weighted finite-state transducer (WFST) that translates sentence styles from in-domain to out-of-domain. This makes it possible to integrate language models of different styles of speaking or dialects and even of dif...
A monotonic statistical machine translation approach to speaking style transformation
This paper presents a method for automatically transforming faithful transcripts or ASR results into clean transcripts for human consumption using a framework we label speaking style transformation (SST). We perform a detailed analysis of the types of corrections performed by human stenographers when creating clean transcripts, and propose a model that is able to handle the majority of the most...
Automatic Transcription of Lecture Speech using Language Model Based on Speaking-Style Transformation of Proceeding Texts
For language modeling in spontaneous speech recognition, we propose a style transformation approach that transforms written texts into a spoken-style language model. Since these two styles differ greatly and direct transformation is therefore difficult, we cascade two transformation methods: rule-based transformation to rewrite written-style texts to intermediate “verbatim” texts, and statist...
Investigation on the effects of ASR tuning on speech translation performance
In this paper we describe some of our recent investigations into ASR and SMT coupling issues from an ASR perspective. Our study was motivated by several questions: first, to understand how standard ASR tuning procedures affect SMT performance and whether it is safe to perform this tuning in isolation; second, to investigate how vocabulary and segmentation mismatches between the ASR and SMT ...
Two vocoder techniques for neutral to emotional timbre conversion
In this paper, we describe the application of two vocoder techniques in an experiment on spectral envelope transformation. We processed speech data in a neutral standard reading style in order to reproduce the spectral shapes of two emotional speaking styles: happy and sad. This was achieved by means of conversion functions which operate in the frequency domain and are trained with aligned sou...